Below is an internally generated list of new and notable companies and open source projects in the data ecosystem. Where available, each entry is contextualized with open source engagement data, along with general traffic and search interest data.
Note: This tracker is a work in progress and will be updated and expanded regularly.
Homepage: https://flyte.org/
Category: Workflow
Github: https://github.com/lyft/flyte
Status: In-house Open Source Project (Lyft)
Homepage: https://marquezproject.github.io/marquez/
Category: Metadata Management
Github: https://github.com/MarquezProject
Status: Open Source Project
Homepage: https://www.superwise.ai/
Category: ML Monitoring
Github:
Status: Standalone Company
Homepage: https://www.prefect.io/
Category: Workflow
Github: https://github.com/PrefectHQ
Status: Standalone Company
Homepage: https://ray.io/
Category: Query Engine
Github: https://github.com/ray-project
Status: Open Source Project
Homepage: https://www.getdbt.com/
Category: Data Transform
Github:
Status: Standalone Company
Homepage: https://dataform.co/
Category: Data Transform
Github: https://github.com/dataform-co
Status: Standalone Company
Homepage: https://www.monalabs.io/
Category: ML Monitoring
Github:
Status: Standalone Company
Homepage: https://www.seldon.io/
Category: ML Ops
Github: https://github.com/SeldonIO
Status: Standalone Company
Homepage: https://www.arthur.ai/
Category: ML Monitoring
Github:
Status: Standalone Company
Homepage: https://databand.ai/
Category: Data Quality
Github: https://github.com/databand-ai/
Status: Standalone Company
Homepage: https://www.soda.io/
Category: Data Quality
Github: https://github.com/sodadata
Status: Standalone Company
Homepage: https://www.proximo.com/
Category: Data Quality
Github:
Status: Standalone Company
Homepage: https://www.fiddler.ai/
Category: ML Monitoring
Github: https://github.com/fiddler-labs
Status: Standalone Company
Homepage: https://mlflow.org/
Category: ML Ops
Github: https://github.com/mlflow
Status: Open Source Project
Homepage: https://www.bentoml.ai/
Category: ML Ops
Github: https://github.com/bentoml
Status: Standalone Company
Homepage: https://www.montecarlodata.com/
Category: Data Quality
Github:
Status: Standalone Company
Homepage: https://www.grid.ai/
Category:
Github: https://github.com/PyTorchLightning
Status: Standalone Company
Homepage: https://dagster.io/
Category: Workflow
Github: https://github.com/dagster-io
Status: Open Source Project
Homepage: https://www.pachyderm.com/
Category: ML Ops
Github: https://github.com/pachyderm
Status: Standalone Company
Homepage: https://torodata.io/
Category: Data Quality
Github:
Status: Standalone Company
Homepage: https://prestodb.io/
Category: Query Engine
Github: https://github.com/prestodb
Status: Open Source Project
Homepage: https://www.snorkel.org/
Category: ML Ops
Github: https://github.com/snorkel-team
Status: Open Source Project
Homepage: https://determined.ai/
Category: ML Ops
Github: https://github.com/determined-ai
Status: Standalone Company
Homepage: https://www.datafold.com/
Category: Data Quality
Github:
Status: Standalone Company
Homepage: https://www.amundsen.io/
Category: Metadata Management
Github: https://github.com/amundsen-io
Status: In-house Open Source Project (Lyft)
Homepage: https://engineering.linkedin.com/blog/2019/data-hub
Category: Metadata Management
Github: https://github.com/linkedin/datahub
Status: In-house Open Source Project (LinkedIn)
Homepage: https://www.deepchecks.com/
Category: ML Monitoring
Github:
Status: Standalone Company
Homepage: https://metaflow.org/
Category: Workflow
Github: https://github.com/Netflix/metaflow
Status: In-house Open Source Project (Netflix)
Homepage: https://monitorml.com/
Category: ML Monitoring
Github:
Status: Standalone Company
Homepage: https://www.starburstdata.com/
Category: Query Engine
Github: https://github.com/starburstdata
Status: Standalone Company
Homepage: https://airflow.apache.org/
Category: Workflow
Github: https://github.com/apache/airflow
Status: Open Source Project
Homepage: https://druid.apache.org/
Category: Query Engine
Github: https://github.com/apache/druid
Status: Open Source Project
Homepage: https://pinot.apache.org/
Category: Query Engine
Github: https://github.com/apache/incubator-pinot
Status: Open Source Project
Homepage: https://spark.apache.org/
Category: Query Engine
Github: https://github.com/apache/spark
Status: Open Source Project
Homepage: https://eng.uber.com/databook/
Category: Metadata Management
Github:
Status: In-house Open Source Project (Uber)